residual variance constant across groups, 3) independent priors for
regression parameters, 4) scaling of the predictors at the grand mean
rather than at the so-called "ideal scaling points", 5) transformation
of all variables including the criterion to mean zero and variance one
at the start of the calculations, 6) an improved search procedure for
finding the mode of the posterior distribution. This new solution
provides a greatly speeded-up algorithm which defines the posterior
mode precisely.
1. Summary

When  multiple  regression  equations  are  to  be  estimated  for  m 
groups  which  are  supposed  to  be  comparable  though  not  identical, 
both  the  pooled  estimates  and  m  separate  least  squares  estimates 
per group may be suboptimal. Lindley, Novick, Jackson and others
have  advocated  a  Bayesian  estimation  procedure  in  which  the  estimates 
would  be  weighted  averages  of  the  separate  estimates  per  group  on 
one  hand  and  some  pooled  estimate  on  the  other  hand,  with  weights 
determined  essentially  by  the  data.  This  extension  of  the  Kelley 
formula  for  regression  to  the  mean  has  proven  its  value  in  several 
cross-validation  studies  (Novick,  Jackson,  Thayer  &  Cole,  1972; 
Lissitz  and  Schoenfeldt,  1974;  Shigemasu,  1976;  Jansen,  1977).  The 
modal  posterior  values  for  intercepts,  slopes  and  residual  variances, 
however,  are  not  easy  to  obtain.  The  procedure  outlined  by  Novick 
et  al.  (1972)  and  Jones  and  Novick  (1972)  still  poses  some  numerical 
and methodological problems. The present paper presents a modified
algorithm removing most of the deficiencies. It remains true,
however, that m-group regression is an example of a Bayesian model
in which it is somewhat difficult to specify a vague prior that
would let the data and the collateral information speak for
themselves.

*Supported in part under ONR Contract #N00014-77-C-0428, Melvin R.
Novick, principal investigator. Opinions stated herein are those
of the authors and not those of the supporting agency.

The major features of the new approach are:

(1) the use of parameters with constant values for slopes
across groups whenever the prior or the data indicate that this is
desirable,

(2) the use of residual variance constant across groups,

(3) independent priors for regression parameters,

(4) scaling of the predictors at the grand mean rather than
at the so-called "ideal scaling points" mentioned in Novick et al.
(1972),

(5) transformation of all variables, including the criterion,
to mean zero and variance one at the start of the calculations,
with return to the raw scaling only for display of results to the
user or for questions to the user.
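Feature (5) can be illustrated with a minimal Python sketch. It assumes that the standardization is pooled over all individuals in all groups and uses the population standard deviation; the function names `standardize` and `to_raw_scale` are hypothetical and do not come from the BR programs.

```python
import numpy as np

def standardize(X, y):
    """Scale predictors and criterion to mean zero and variance one.

    Assumption: means and standard deviations are pooled over all groups.
    X is an (n, l) array of predictor scores, y an (n,) criterion vector.
    """
    x_mean, x_sd = X.mean(axis=0), X.std(axis=0)
    y_mean, y_sd = y.mean(), y.std()
    Z = (X - x_mean) / x_sd
    z = (y - y_mean) / y_sd
    return Z, z, (x_mean, x_sd, y_mean, y_sd)

def to_raw_scale(b0_std, b_std, scales):
    """Convert a standardized intercept and slopes back to the raw scaling."""
    x_mean, x_sd, y_mean, y_sd = scales
    b_raw = b_std * y_sd / x_sd
    b0_raw = y_mean + y_sd * b0_std - np.sum(b_raw * x_mean)
    return b0_raw, b_raw
```

All calculations would then run on `Z` and `z`, and `to_raw_scale` would be called only when results are shown to the user.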

Section 2 of this report gives a description of the old model
(used by Novick et al., 1972) and a schematic comparison with the
new model. Section 3 describes the old iterative algorithm for
obtaining modal posterior estimates, and its subsections 3a, 3b,
and 3c deal with the deficiencies of that algorithm. Section 4
with its subsections 4a through 4e discusses the revisions on
which the new model is based. Section 5 then outlines the new
model and derives the corresponding equations. Section 6, which is
as far as possible independent of the preceding material, contains
some information for users of the m-group regression program, and
a final section 7 discusses possible future extensions.

2.  Old  model  specification 

In the model used by Jones and Novick (1972) and by Novick et
al. (1972) for simultaneous regression in m groups, a first stage
describes how the criterion is distributed given the regression
parameters and given the predictor values. Considering the groups
as exchangeable, the next stage treats the regression parameters
(including intercept and residual variance) as a random sample from
some distribution, characterized by unknown hyperparameters. A third
stage specifies some, rather vague, information on these hyperparameters
(cf. Lindley and Smith, 1972).

The stages are described in Novick et al. (1972) and summarized
below side by side with the new model which will be discussed in
sections 4 and 5. In both models the data for the $j$-th individual
out of the $n_i$ individuals in the $i$-th group ($i = 1, 2, \ldots, m$)
consist of a criterion score $y_{ij}$ and scores $x_{kij}$ on $\ell$
predictors ($k = 1, 2, \ldots, \ell$; $j = 1, 2, \ldots, n_i$). In each
$(\ell + 1) \times n_i$ matrix of predictor scores we include a row of
ones for the intercept. For the new model the index set
$\{0, 1, \ldots, \ell\}$ is partitioned into two disjoint subsets $F$
(parameters common to all groups) and $G$ (parameters different
across groups).




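TABLE 1

OLD MODEL

First stage:
    $y_{ij} \sim N\bigl(\sum_{k=0}^{\ell} \beta_{ki} x_{kij},\ \phi_i\bigr)$

Second stage:
    $(\beta_{0i}, \ldots, \beta_{\ell i}) \sim N(\mu, H^{-1})$
    $\phi_i \sim \chi^{-2}(\nu, \nu\sigma^2)$

Third stage:
    $\mu$ and $\log \sigma^2$ uniform $(-\infty, \infty)$;
    $H \sim$ Wishart $(\nu', \Sigma, \ell + 1)$;
    $\Sigma$ diagonal matrix.

User should supply (as prior belief):
    $\nu'$ (small)
    diagonal elements of $\Sigma$

NEW MODEL

First stage:
    $y_{ij} \sim N\bigl(\sum_{f \in F} \beta_f x_{fij} + \sum_{g \in G} \beta_{gi} x_{gij},\ \phi\bigr)$

Second stage:
    $\beta_f \sim$ uniform $(-\infty, \infty)$;
    $\beta_{gi} \sim N(\mu_g, \phi_g)$;
    $\log \phi \sim$ uniform $(-\infty, \infty)$.

Third stage:
    $\mu_g \sim$ uniform $(-\infty, \infty)$;
    $\phi_g \sim \chi^{-2}(\nu', \nu'\tau_g)$.

User should supply (as prior belief):
    $\nu'$ (small)
    $\tau_g$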

This schematic presentation is restricted to the essentials.
Independence assumptions, conditionings and ranges of indices are
described more fully in Novick et al. (1972) for the old model and in
sections 4 and 5 for the new one. For the old model, Lindley (1970)
details how integration over the hyperparameters leads to a posterior
density for the regression parameters given the data. Up to an
additive constant, its logarithm is (Lindley, 1970, formula 11):
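$$
\begin{aligned}
\log p ={}& -\tfrac{1}{2} \sum_i (n_i + 1) \log \phi_i
  - \tfrac{1}{2} \sum_i \phi_i^{-1} \sum_j \Bigl( y_{ij} - \sum_k \beta_{ki} x_{kij} \Bigr)^2 \\
& -\tfrac{1}{2} (m + \nu' - 1) \log \Bigl| \nu' \sigma_{hk}
  + \sum_i (\beta_{hi} - \beta_{h\cdot})(\beta_{ki} - \beta_{k\cdot}) \Bigr| \\
& -\tfrac{1}{2} (m + 1) \log \log \{ \eta (\theta^{-1} + \kappa) \}.
\end{aligned}
\qquad (1)
$$

Here $\theta$ and $\eta$ denote the harmonic and geometric mean of the
set $\{\phi_i\}$ respectively, $\beta_{h\cdot}$ denotes the mean across
$i$ of $\beta_{hi}$, and $|a_{hk}|$ denotes the determinant of an
$(\ell + 1) \times (\ell + 1)$ matrix $A$ with elements $a_{hk}$. The
constant $\kappa$ is introduced to insure convergence (Lindley, 1970,
page 3). For $\ell$ predictors and $m$ groups, (1) is a function of
$(\ell + 2)m$ parameters. Its maximization leads to the desired
posterior modal estimates, but it poses some problems.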

3.  Problems  of  the  old  model 

The computer programs made available by Novick et al. seek the
maximum of (1) by the following iterative procedure. An initial set
of estimates should be computed first; one might take the least
squares estimates per group, the least squares estimates for the pooled
sample or the so-called model II estimates, see below. Equating the
derivatives of (1) with respect to $\beta_{hi}$ to zero, for fixed $i$,
leads to a set of equations which are linear in $\beta_{hi}$ if one
temporarily considers $\phi_i$, $\beta_{h\cdot}$ ($h = 0, 1, \ldots, \ell$),
and the determinant as fixed. They are successively solved for each $i$;
after updating means and determinant this is repeated twice. Next the
updated values for all $\beta_{hi}$ are used to obtain new $\phi_i$ by
equating the derivative of (1) with respect to $\phi_i$ to zero; such
equations are linear in $1/\phi_i$ provided that $\eta$, $\theta$ and
all $\beta_{hi}$ are temporarily considered as fixed. This whole process
is called one iteration cycle, and such cycles should be repeated until
the increase per cycle of the function (1) has become negligible.

This algorithm has been used in several applications mentioned
in section 1, but not without problems:

(a) very slow convergence;

(b) non-robustness against choice of prior values for $\nu'$ and
$\sigma_{hh}$;

(c) non-robustness against initial choice of estimates;

(d) suboptimal determination of the mean value $\beta_{h\cdot}$ for
regression parameters for which almost total regression takes place.

3a.  Slow  convergence 

The type of very slow convergence encountered most frequently
consists of a few drastic changes in the first cycles followed by a
slow, decelerating and monotone movement of each $\beta_{hi}$ value to
its limit. As explained in detail in Molenaar (1978), insertion after
every 4th, 5th, or 6th iteration of leaps (extrapolating from the past
three values in a geometric series model for each parameter separately)
typically reduces the total required computer time by a factor of 2
to 4; in exceptional cases a trial run of some 10 to 20 cycles
including leaps could be examined, after which a change of the default
values of the leap process produces a fully satisfactory convergence.
The computer time involved in the bookkeeping of the pre-leap values
is more than gained back because one efficient leap step may produce
more improvement of the goal function (1) than ten or even fifty
ordinary iterations.

Since publication of Molenaar (1978) the leap process underwent
two simplifications. First of all, residual variance estimates were
nearly always very stable across iterations, and therefore no leaps
are programmed for them. Secondly, after the first few iterations,
both the variances of the regression parameters (across groups) and
the z-scores obtained by standardization across groups of the
individual parameters in each group were also very stable across
iterations.

The revised algorithm, therefore, calculates leaps only for the means
across groups of the regression parameters. This means that the
individual values at each iteration cycle, or their differences
between two cycles, need no longer be stored for calculating the
geometric series ratio underlying the leap. It suffices to
extrapolate at each leap the past three mean values (across groups) of
each slope and the intercept (taken at the grand mean). For each of
those means a value is extrapolated from the geometric series model, and
the after-leap value of each individual estimate is simply obtained
by translation to the new mean value. The provisions replacing an
unsatisfactory geometric series leap remain as in Molenaar (1978), with
the exception that the default values now are:
- first leap after 5 cycles;

- each leap after 4 more cycles;

- no leap if mean stable in 3 leading digits in last cycle;

- leap = 20 * last difference if difference changed sign or is
almost zero;

- leap = 20 * last difference if last difference larger or hardly
smaller than preceding difference;

- stop if log posterior density stable in 5 leading digits.
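The leap for one mean can be sketched in Python as follows. This is only an illustration of the geometric series extrapolation and of the fallback rules listed above; the function names and the thresholds `max_ratio` ("hardly smaller") and `tiny` ("almost zero") are assumptions, not the values used in the BR programs.

```python
def geometric_leap(x0, x1, x2, fallback_factor=20.0, max_ratio=0.95, tiny=1e-12):
    """Extrapolate the limit of a sequence from its last three values.

    Under a geometric series model the successive differences shrink by a
    constant ratio r, so the limit is x2 + d_last * r / (1 - r).
    max_ratio and tiny are illustrative thresholds (assumptions).
    """
    d_prev = x1 - x0
    d_last = x2 - x1
    sign_changed = d_prev * d_last < 0
    almost_zero = abs(d_last) <= tiny or abs(d_prev) <= tiny
    if sign_changed or almost_zero:
        # Fallback rule: leap = 20 * last difference.
        return x2 + fallback_factor * d_last
    ratio = d_last / d_prev
    if ratio >= max_ratio:
        # Last difference larger or hardly smaller than the preceding one.
        return x2 + fallback_factor * d_last
    return x2 + d_last * ratio / (1.0 - ratio)

def leap_group_values(values, mean_history):
    """Translate all group values of one parameter to the extrapolated mean."""
    new_mean = geometric_leap(*mean_history[-3:])
    shift = new_mean - mean_history[-1]
    return [v + shift for v in values]
```

For means 0, 0.5, 0.75 the differences halve at each cycle, so the sketch leaps to the limit 1.0 in one step.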

The previous version also stopped iteration when all parameters
were stable in a user-specified number of digits. This provision is
now deleted, because it was almost never fulfilled and led to much
bookkeeping and time loss.

A FORTRAN program called BR is available in which all parameters
just mentioned can be manipulated, as well as a few others. For
regular use in the CADA Monitor, however, it is doubtful whether a user
would have the skill to gain from successful manipulation of the
parameters as compared to running some extra iterations. The BASIC
version of the program, called BR1, BR2, therefore fixes all parameters
at the just-mentioned default values. For the exceptional case that
manipulation is desired, it could be obtained by either changing the
BASIC source deck or using the FORTRAN version.
3b. Prior values robustness problem

In the log posterior density given by formula (1), the quantities
$\sigma_{hk}$ and $\nu'$ should be provided by the user, as a
description of a plausible covariance matrix for the regression
parameters and an indication of the uncertainty associated with that
description (smaller $\nu'$ implying greater uncertainty). In Novick
et al. (1972), it was advised to take $\nu' = 1$ unless specific prior
knowledge is available.

It was also advised to take the off-diagonal elements $\sigma_{hk}$,
$h \neq k$, equal to zero, with the proviso of scaling the predictors
at the "ideal scaling points" described in Novick et al., 1972, p. 37.
This is because the intercepts can only be considered independent of
the slopes when the predictors are suitably scaled. The problem of
prior specification is now reduced to a choice of values for the
diagonal elements $\sigma_{hh}$.

If the user could provide prior estimates, say $\tau_h$, for the
variances of $\beta_{hi}$, it was advised to identify these with the
prior marginal modes of these variances, namely
$\nu' \sigma_{hh} / (\nu' + 2)$. For $\nu' = 1$, this leads to the
specification $\sigma_{hh} = 3\tau_h$. (This point is discussed in
section 4c.)
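The arithmetic behind this specification is short enough to write out. The Python helper below (a hypothetical name, not part of any program cited here) simply solves the prior marginal mode $\nu' \sigma_{hh} / (\nu' + 2) = \tau_h$ for $\sigma_{hh}$.

```python
def sigma_from_prior_mode(tau, nu_prime=1.0):
    """Solve nu' * sigma / (nu' + 2) = tau for sigma.

    tau is the user's prior estimate of the variance of one regression
    parameter across groups; nu_prime expresses the certainty of that belief.
    """
    return tau * (nu_prime + 2.0) / nu_prime
```

With the default `nu_prime = 1.0` the result is three times `tau`, which is the specification quoted above; a larger `nu_prime` brings the result closer to `tau` itself.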

As a practical matter, even providing $\tau_h$ values could be difficult
for a user without specific prior knowledge. Therefore, Novick et al.
(1972) advised setting $\tau_h$ equal to the corresponding unbiased
sampling theory estimates, based on the current data, for the variances
of the regression parameters. The development of these model II ANOVA
estimates is given by Jackson (1972). As noted by Jackson, Novick and
Thayer (1971), there are two difficulties with this advice. The first
is the theoretical point that prior quantities should not be derived
from the data being analyzed. When $\nu' = 1$, however, it was hoped
that the precise choice of $\tau_h$ would matter little for the
posterior distribution of the regression parameters. In this light, the
use of model II estimates may be seen as merely a convenient shortcut.

The second difficulty is a practical one: the model II estimates may
sometimes be negative. In this case, it was advised to select a
"small" positive value for $\tau_h$. As with the first point, it was
hoped that the precise choice would not be too important.

The robustness of the final estimates to variations in the choice
of $\tau_h$ was, in fact, illustrated for a simple case (10 groups, 1
predictor) by Jackson et al. (1971, p. 140). We shall now consider an
illustration chosen to show that this robustness is not always so
apparent. From the 25 percent sample of the 1968 ACT data analyzed by
Novick et al. (1972), 12 of the 22 groups were selected (called the
"12HOMO" dataset in Molenaar, 1978). Table 2 gives the modal estimates
obtained for these data when different $\tau_h$ values are used. For
easier comparison, the estimates for the 12 groups have been replaced
by the mean and standard deviation of those 12 values for each of the
regression parameters. As before, $\nu' = 1$ and
$\sigma_{hh} = 3\tau_h$ were used for all estimates in the table.
Moreover, the iteration process described in section 3 always used the
least squares values as initial estimates for the regression parameters.
The problem of choice of initial estimates is considered in detail in
section 3c.
In the first four lines of the table, the only variation is in
the small positive constant replacing the ANOVA estimates of $\tau_2$
and $\tau_4$, which were negative. Note that multiplication of those
prior variances by a factor of 10 leads to posterior modal estimates in
which the standard deviations are about 10 times as large; there is
little effect on the means of $\beta_2$ and $\beta_4$ or on the other
parameters.

In the next 3 lines of the table we have used some priors that somebody
vaguely familiar with regression equations for ACT scores might
have specified. Note that the data no longer show the almost total
regression of the $\beta_2$ and $\beta_4$ values previously imposed by
the very small prior $\tau_2$ and $\tau_4$.

In the eighth line we have purposively made $\tau_1$ and $\tau_3$
smaller than $\tau_2$ and $\tau_4$: the standard deviations of the
modal estimates faithfully reflect this prior pseudo-information,
although the model II ANOVA estimates were supposed to tell us that the
data suggest total regression for the slopes pertaining to the second
and fourth predictor, not the first and third. The ninth line shows
that large prior variances produce a solution very close to the LS
values. The final three lines give the characteristics of the LS
estimates, the model II estimates and the regression coefficients when
data from all groups are pooled into one sample.

Finally, note that Table 2 contains a line marked "II but twice
$\tau_1$", in which the only change compared to the top line is
doubling the value of $\tau_1$. The fact that the a priori most
probable value of just one of the slope variances now is twice as
large, i.e., the standard deviation is multiplied by 1.414, makes the
standard deviation of $\beta_{1i}$ 1.464 times as large, but at the
same time decreases the standard deviation of $10^4 \beta_{3i}$ from
23 to 18. Looking at the modal estimates of the regression parameters
themselves, the prediction for the third group changes most:

it was:     $-.431 + .018 X_1 + .017 X_2 + .015 X_3 + .017 X_4$

it becomes: $-.401 + .013 X_1 + .017 X_2 + .016 X_3 + .017 X_4$

What conclusions can be drawn from this detailed presentation?
As long as the amount of variability among regression coefficients is
small, the variability of the Bayesian posterior modal estimates is
strongly influenced by the prior specification; it was already noted
by Novick et al. (1972) that the small positive constant replacing
negative model II ANOVA estimates should be chosen with some care.
The means across groups, on the other hand, are rather stable in Table 2,
and it should be kept in mind that a standard deviation of .026 or of
2.5 around a mean of 176 leads to almost the same prediction equations.
The quality of multiple regression equations in cross-validations is
remarkably stable against changes in regression weights (Dawes, 1978;
Wainer, 1976), so the differences in Table 2 may after all not be
disastrous. On the other hand, in many cross-validation studies Bayesian
estimates are superior only by a few percent to least squares per group,
so a careful prior specification remains important. We shall resume
this discussion in section 6, where the revised model will be similarly
examined.


3c.  Almost  total  regression;  a  threat  to  the  model 

It is well known that complete equality of parameters across
groups leads to problems in Bayesian simultaneous estimation (Novick,
Jackson & Thayer, 1971; Lindley, 1971; Novick et al., 1972; Novick,
Lewis & Jackson, 1973). By the introduction of informative priors,
Lindley, Novick and others have tried to avoid the degeneracy problems.
This was satisfactory in the case of the residual variances in m-group
regression, discussed in section 4a. For the slopes and the intercept,
however, it does not help enough. This will be illustrated first by
examining the log posterior density, and then by a numerical example.
The main feature of our new model, then introduced in section 4b, was
motivated by the desire to get rid of the degeneracy problem.

Let us now examine the effect of almost total regression for a
parameter on the log posterior density (1) which was given in section 2.
It is obvious that the first line of (1) would be maximized by the
least squares (LS) values. The second line is maximized by bringing the
determinant as close to zero as possible. When the user has supplied
some small values for $\nu' \sigma_{hh}$, this is achieved by linear
dependence among the m-vectors $\beta_h$ ($h = 0, 1, \ldots, \ell$). Now
as soon as the estimated values of $\beta_{hi}$ for some $h$ lie very
close together (almost total regression), a change in their deviations
from the mean $\beta_{h\cdot}$ has almost no further influence on the
residual sum of squares in the first line of (1), and thus it is used
to make the determinant decrease. In other words, it pays to let the
$(\ell + 1)$-variate normal distribution of the $\beta_i$ degenerate
into a lower-dimensional one. Although the positive value of
$\nu' \sigma_{hh}$ prevents complete degeneration, the algorithm based
on the old model is deficient: because of the group-by-group
calculation of new $\beta_{hi}$, a change in a single $\beta_{hi}$ has
far more effect on the log posterior density than a change in the mean
$\beta_{h\cdot}$, and the optimal value for $\beta_{h\cdot}$ is never
found for indices $h$ with small variance across groups.
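The asymmetry between moving one group value and moving the mean can be made concrete with a small numerical sketch of the determinant line of (1). The Python code below is an illustration under stated assumptions: the data are artificial, the prior matrix is diagonal, and the names `determinant_term`, `shift_all` and `move_one` are hypothetical.

```python
import numpy as np

def determinant_term(beta, sigma_diag, nu_prime=1.0):
    """Second line of (1): -(m + nu' - 1)/2 * log|nu' * Sigma + S|.

    beta is an (m, l + 1) array with one row of regression parameters per
    group; S holds the sums of cross-products of deviations from the
    across-group means. Sigma is taken to be diagonal.
    """
    m = beta.shape[0]
    dev = beta - beta.mean(axis=0)
    A = nu_prime * np.diag(sigma_diag) + dev.T @ dev
    _, logdet = np.linalg.slogdet(A)
    return -0.5 * (m + nu_prime - 1.0) * logdet

# Artificial example: 12 groups, 3 parameters, the second one totally regressed.
rng = np.random.default_rng(0)
beta = rng.normal(size=(12, 3))
beta[:, 1] = 0.017
sigma_diag = np.array([1.0, 3e-7, 1.0])   # very small prior variance for index 1

base = determinant_term(beta, sigma_diag)

shift_all = beta.copy()
shift_all[:, 1] += 0.003                  # move the mean, keep deviations at zero

move_one = beta.copy()
move_one[0, 1] += 0.003                   # move a single group away from the others

print(base, determinant_term(shift_all, sigma_diag), determinant_term(move_one, sigma_diag))
```

Shifting all group values together leaves the term unchanged, whereas moving a single group by the same amount lowers it. An algorithm that updates one group at a time therefore sees a penalty for every step away from the current mean, even when a different mean would fit better.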

Table 3a shows that such undesirable behavior was indeed found
for the "12HOMO" dataset used before. In each block of lines of this
table, the same prior specification was combined with various initial
values, described in Table 3b. Note that especially in block 1 suboptimal
convergence occurs for LS, LSM or MD2 initial values; the log posterior
density remains at what seems to be a local maximum, and the maximizing
values of $\beta_{h\cdot}$ thus obtained differ markedly from those
found with PLD initial values. Although Table 3b shows that LSM and MD2
are quite different, they lead to virtually the same modal solution in
both blocks of Table 3a; the solution from LS initial values is worse,
and from PLD it is better. Similar results were found for other datasets
than "12HOMO".


Table 3a. Comparison of log posterior density and modal Bayesian
estimates at the end of the iteration process for four sets of initial
estimates described in Table 3b ("12HOMO" dataset). Within each block
the same prior specification is used and thus the final log post. d.
and estimates should be identical, apart from rounding errors. The
algorithm was programmed to stop when the criterion remained constant
in five significant digits. Instead of the 12 estimates of each
parameter (one per group), their mean M and standard deviation SD are
given. The intercept as given here pertains to "ideal scaling", see
Novick et al. (1972, p. 37).

Block 1: $\nu' = 1$ and prior model II with $\tau_2$ and $\tau_4$
(negative) replaced by a small positive constant

initial  log post.d.  $10^3\beta_0$  $10^4\beta_1$  $10^4\beta_2$  $10^4\beta_3$  $10^4\beta_4$  $10^3\phi$
                      M (SD)         M (SD)         M (SD)         M (SD)         M (SD)         M (SD)
LS       280.76       -91 (202)      310 (68)       174 (.017)     189 (23)       173 (.026)     399 (4.3)
LSM      281.21       -103 (201)     322 (69)       163 (.017)     203 (23)       150 (.026)     399 (3.9)
PLD      282.02       -62 (201)      309 (69)       157 (.017)     181 (24)       201 (.025)     399 (3.5)
MD2      281.08       -103 (201)     321 (69)       163 (.017)     203 (23)       150 (.026)     399 (4.0)

Block 2: $\nu' = 1$ and prior model II with $\tau_2$ and $\tau_4$
(negative) replaced by $10^{-4}$

initial  log post.d.  $10^3\beta_0$  $10^4\beta_1$  $10^4\beta_2$  $10^4\beta_3$  $10^4\beta_4$  $10^3\phi$
                      M (SD)         M (SD)         M (SD)         M (SD)         M (SD)         M (SD)
LS       199.27       -85 (192)      307 (68)       170 (17)       190 (26)       179 (24)       395 (4.2)
LSM      199.85       -83 (192)      307 (68)       168 (17)       191 (26)       180 (24)       395 (3.8)
PLD      200.57       -89 (192)      307 (68)       171 (17)       192 (26)       175 (24)       396 (3.4)
MD2      199.72       -85 (192)      306 (68)       169 (17)       191 (26)       179 (24)       395 (3.9)

Table 3b. Four sets of initial estimates for the "12HOMO" dataset.

initial    $10^3\beta_0$  $10^4\beta_1$  $10^4\beta_2$  $10^4\beta_3$  $10^4\beta_4$  $10^3\phi$
estimates  M (SD)         M (SD)         M (SD)         M (SD)         M (SD)         M (SD)
LS         -123 (263)     247 (349)      163 (160)      304 (301)      150 (200)      442 (129)
LSM        -123 (0)       247 (0)        163 (0)        304 (0)        150 (0)        442 (0)
PLD        -70 (0)        300 (0)        157 (0)        141 (0)        201 (0)        497 (0)
MD2        -95 (270)      251 (91)       163 (.036)     284 (67)       150 (.037)     441 (59)

Explanation: LS are the least squares estimates, LSM is their mean
across groups, PLD the pooled estimates taking all individuals from
all groups together, MD2 are the Model II ANOVA estimates.


In a trial and error procedure not reported in Table 3, we have
modified the PLD set of initial values with regard to $\beta_{2i}$ and
$\beta_{4i}$, the two sets of parameters which are almost totally
regressed in Block 1. The final means across groups for the two sets of
estimates are essentially identical with the initial values thus
modified. One such modification even gives a slightly larger log
posterior density than that based on PLD.

Several other trials have convinced us that the sensitivity to
initial values specification is not something very exceptional, and that
it seems to be most pronounced when some prior variances are specified
to be very small. The initial values for such a parameter then have
a mean which remains almost unchanged during the iterations, even
though a change could produce a higher value of the log posterior
density. This is because the algorithm adapts one $\beta_{hi}$ at a
time: moving it away from the slope values in the other groups is
immediately punished by a decrease due to the determinant in (1). Our
proposal in the next section to take $\beta_{hi}$ equal across groups
for certain values of $h$ is expected to bypass this undesirable
property of the present algorithm.

4.  Revised  model  assumptions 

The problems and deficiencies described above have led the authors
to provide a revised model, which was schematically described in Table 1.
As the algorithm based on the new model is intended for the CADA Monitor
and will be regularly used on medium size computers, it was decided to
introduce a few more simplifications. The subsections 4a through 4e
comment on those changes; the model itself and its consequences will
be described in section 5.

4a.  Constant  residual  variance 

In a theory of Bayesian m-group regression, the groups are
considered to be exchangeable, but to have varying intercepts, slopes
and residual variances. A strictly common value for the latter is
explicitly forbidden because it would lead to divergence problems.
A small constant $\kappa$ is introduced in the formulae involving the
geometric and harmonic mean for just this reason. When the value of
$\kappa$ was varied between .01 and .0001 times the harmonic mean, this
had some influence on the across groups variability of the estimated
residual variances; the modal estimates of slopes and intercepts,
however, remained very stable.

We have no reason to believe that homoscedasticity across groups
is a more, or less, realistic assumption than homoscedasticity within
groups. Moreover, in all examples of Bayesian m-group regression that
we have seen, the coefficient of variation of the final Bayesian
estimates of $\phi_i$ did not exceed 1 or 2 per cent. Finally, it is
found both in the algebraic formulae and in the empirical results that
the Bayesian estimates of $\beta_{hi}$ (which are the main goal) are
hardly affected at all when small or moderate differences between
$\phi_i$ across groups are ignored.

In the model outlined in section 5, we shall thus assume that
each observation has the same residual variance $\phi$, which has itself
a noninformative prior proportional to $\phi^{-1}$. The latter
assumption could be replaced by an inverse chi square specifying prior
knowledge on $\phi$. The data provide us, however, with so much
information on $\phi$ that such prior information will not be important.

In  remark  6  of  section  6  of  Molenaar  (1978)  a  warning  was  given 
for  a  perfect  or  almost  perfect  fit  in  at  least  one  group.  Division 
by  an  estimated  residual  variance  of  zero,  or  very  close  to  zero, 
could  of  course  create  problems.  Now  that  a  common  value  across  groups 
is  used,  the  risk  of  too  small  values  for  this  residual  variance  has 
become  negligible,  and  the  previous  use  of  a  lower  bound  PHIMIN  for 
residual  variances  has  not  been  continued. 

4b.  Common  values  in  case  of  low  variance 

It  has  been  documented  in  section  3c  that  the  algorithm  does  not 
perform  well  as  soon  as  some  regression  coefficient  shows  very  little 
variance  across  groups.  The  lack  of  variance  may  be  obtained  because 
its  prior  estimate  is  very  small  (the  actual  model  II  estimate  might 
be negative, in which case Jones and Novick suggest replacement by
10  ^).  It  may  also  happen  that  the  values  for  some  parameter  get 
very  close  together  during  the  iteration  process,  although  both  the 
prior variance estimate and the initial values do not indicate this
behavior. 

In both cases, a variance of less than a user-specified bound
TAUMIN is a reason for replacing all values $\beta_{hi}$
($i = 1, 2, \ldots, m$) by their mean $\beta_{h\cdot}$; it will no
longer be assumed that such a parameter is distributed across groups as
a component of the multivariate normal $N(\mu, H^{-1})$ distribution
mentioned in section 2, but that it has a common value $\beta_h$ which
has a uniform prior distribution. Because slopes and intercepts can be
very different according to the scales being used, the bound TAUMIN is
applied after standardization of all variables, see subsection 4e.

In the model as described below, it is assumed that the index set $\{0, 1, 2, \ldots, \ell\}$ denoting the intercept and the $\ell$ predictors is subdivided into a set F (mnemonic for fixed) for which this total regression has taken place, and its complement G (mnemonic for general) for which the values across the groups are different. The predicted value for the j-th element of the i-th group can thus be written as

$$\hat{y}_{ij} = \sum_{f \in F} \beta_f x_{fij} + \sum_{g \in G} \beta_{gi} x_{gij}.$$

The $(\ell + 1)$-dimensional multinormal distribution of $\beta_{hi}$ ($h = 0, 1, \ldots, \ell$), for which some components have a variance very close to zero, will thus be replaced by a vector of which some components are common to all groups, whereas the other components have a normal distribution of lower dimensionality. The actual effect on prediction of this replacement is negligible if the variance bound TAUMIN for admission to the index set F is kept low enough.

One full cycle of the iteration process now consists of four parts (see also section 5):

(a) solution of $\{\beta_f \mid f \in F\}$ by LS regression of $y_{ij} - \sum_{g \in G} \beta_{gi} x_{gij}$ on $\{x_{fij}\}$, treating $\{\beta_{gi}\}$ as known;

(b) solution of $\{\beta_{gi}\}$ by solving a system of linear equations which results from equating the derivatives of the log posterior density to zero, treating $\{\beta_{g.}\}$ and $\{\sum_{i=1}^{m} (\beta_{gi} - \beta_{g.})^2\}$ as known;

(c) solution of $\phi$, treating all $\beta_f$ and $\beta_{gi}$ as known;

(d) check whether any index should pass from G to F.

The split of all indices into the subsets F and G essentially means that the revised model is really used as a class of models, or rather a lattice consisting of $2^{\ell+1}$ models, because that is the number of partitions of $\{0, 1, \ldots, \ell\}$ into F and G. An example is given in Figure 1.
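The size of this lattice can be checked by direct enumeration. The following is a minimal sketch, not part of the original programs; the function name `fg_partitions` is hypothetical, and index 0 is assumed to denote the intercept.

```python
from itertools import combinations

def fg_partitions(n_predictors):
    """List every split of the indices {0, ..., n_predictors} into (F, G).

    F holds the indices of parameters common to all groups,
    G the indices of parameters that differ per group.
    """
    indices = list(range(n_predictors + 1))
    models = []
    for size in range(len(indices) + 1):
        for fixed in combinations(indices, size):
            free = tuple(h for h in indices if h not in fixed)
            models.append((fixed, free))
    return models

models = fg_partitions(2)       # intercept plus two predictors
print(len(models))              # number of models in the lattice
```

With two predictors the list contains both the fully pooled model (G empty) and the case G = {0} attributed above to Shigemasu (1976).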


The constant parameters at the beginning of the iterations are those with prior variance estimates less than TAUMIN, which is set at 10^(…) in the current version. The user may force this by supplying zero entries in subjective prior estimates, or the data may force it if model II prior variance estimates are used and these come out less than TAUMIN or even negative. During the iteration process, more indices may pass into F. The bottom model in Figure 1 with an empty G automatically produces the pooled estimates. The model used by Shigemasu (1976) is the special case with G = {0}: free intercept and constant slopes were postulated by Shigemasu, but are just one of the many possible models here.


4c. Independent priors for regression coefficients

The original model outlined in section 1 contains a multivariate normal $(\mu, H^{-1})$ distribution for $\beta_i$ and for $H$ a Wishart $(\nu', \Sigma, \ell + 1)$ distribution. Earlier publications recommend taking $\nu' = 1$, $\Sigma_{hk} = 0$ for $h \neq k$ and $\Sigma_{hh}$ three times a suitable prior estimate of the variance of $\beta_h$ (including the intercept as $\beta_0$). As was explained above, the revised model allows that some $\beta_{fi}$ have a common value $\beta_f$, for which a uniform prior is assumed. For the remaining parameters, say $\beta_{gi}$, it was decided to replace the assumption $\Sigma_{hk} = 0$ by the slightly stronger assumption that $H^{-1}$ itself has zero off-diagonal values. Our new model then becomes:

$\beta_f \sim$ uniform $(-\infty, \infty)$;

$\beta_{gi} \sim N(\mu_g, \psi_g)$;

$\mu_g \sim$ uniform $(-\infty, \infty)$;

$\psi_g \sim \chi^{-2}(\nu', \nu'\tau_g)$;

all $\beta_f$ and $\beta_{gi}$ independent given $\mu_g$ and $\psi_g$;

all $\mu_g$ and $\psi_g$ independent given $\nu'$ and $\tau_g$.

This modification leads to a substantial simplification of the algorithm. Although prior knowledge on covariances between parameters is conceivable, it will rarely be substantial, and the revised model of course permits such covariances in the posterior distribution.

We are not advocating the use of a factor 3 in multiplying a most plausible value for the variance to find the prior value for $\tau_g$. It was used in the old model because the mode of any $\chi^{-2}(\nu', \nu'\sigma^2)$ distribution is $\nu'\sigma^2/(\nu' + 2)$, which means for $\nu' = 1$ equating the mode to $\sigma^2/3$, and taking three times the mode for $\sigma^2$. This argument fails to take into account that the natural way to think about a variance (now called $\psi_g$) of a regression parameter is in the logarithmic scale (that is why the uniform distribution for $\log \psi_g$ would be used as an ignorance prior). But if $\psi_g \sim \chi^{-2}(\nu', \nu'\tau_g)$ then the density of $w = \log \psi_g$ can be derived to be proportional to $\exp\{-\tfrac{1}{2}\nu'(w + \tau_g e^{-w})\}$, and this has its mode at $w = \log \tau_g$. An extra advantage is that the mode of the log standard deviation is now the corresponding $\log \tau_g^{1/2}$.

When the user is asked for a "most probable value" of the standard deviation of the true regression coefficients across groups, we prefer to use this value as a specification of $\tau_g^{1/2}$.
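The two modes discussed here can be checked numerically. The sketch below assumes the inverse chi-square density of $\psi$ is proportional to $\psi^{-(\nu'/2+1)} \exp\{-\nu'\tau/(2\psi)\}$; the function names and the example values of `nu` and `tau` are illustrative only.

```python
import math

def log_density_psi(psi, nu, tau):
    """Unnormalized log density of psi ~ inverse chi-square(nu, nu*tau)."""
    return -(nu / 2.0 + 1.0) * math.log(psi) - nu * tau / (2.0 * psi)

def log_density_w(w, nu, tau):
    """Unnormalized log density of w = log(psi); the Jacobian adds one power of psi."""
    return -0.5 * nu * (w + tau * math.exp(-w))

def grid_mode(f, lo, hi, steps=200001):
    """Return the point of an even grid on [lo, hi] at which f is largest."""
    best_x, best_val = lo, f(lo)
    for k in range(1, steps):
        x = lo + (hi - lo) * k / (steps - 1)
        val = f(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

nu, tau = 1.0, 0.25
mode_psi = grid_mode(lambda p: log_density_psi(p, nu, tau), 1e-4, 2.0)
mode_w = grid_mode(lambda w: log_density_w(w, nu, tau), -10.0, 10.0)
print(mode_psi, nu * tau / (nu + 2.0))   # mode on the variance scale
print(mode_w, math.log(tau))             # mode on the log scale
```

On the variance scale the mode is pulled down to $\nu'\tau/(\nu'+2)$, whereas on the log scale it sits at $\log \tau$, which is the argument for dropping the factor 3.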

4d.  Leaps  for  the  mean  only 

This change has already been motivated and discussed in subsection 3a.

4e. Standardization of variables

Standardizing all predictors and the criterion to zero mean and unit variance in the pooled sample means that the predictors get the common and absolute scale of beta weights and that the intercepts are prevented from assuming very large absolute values. This is better for the numerical accuracy, and it makes it possible to use a fixed quantity (currently 10^(…)) for the minimum variance TAUMIN below which an index is passed to the set F and the corresponding parameter is assumed constant across groups. It is obvious that this would be undesirable for raw slopes, of which one could range from .0004 to .0008 and another from 4000 to 8000, say. At the end, just before the modal estimates are printed, a reconversion to the raw scales is made, but the intercept at the grand mean is printed as an extra column because it might be more meaningful than the intercept for all predictors zero. The desirability of the standardization was pointed out earlier in remark 6 of section 6 of Molenaar (1978).
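The reconversion to raw scales can be sketched as follows. This is an illustration under the assumption that the standardized model reads $(y - \bar{y})/s_y = a^* + \sum_k b_k^* (x_k - \bar{x}_k)/s_k$ with pooled means and standard deviations; the function and argument names are hypothetical and are not taken from the programs.

```python
def to_raw_scale(a_std, b_std, y_mean, y_sd, x_means, x_sds):
    """Convert standardized regression weights back to the raw scales.

    Returns (intercept at zero, intercept at the grand mean, raw slopes).
    """
    # A standardized slope is rescaled by the ratio of the standard deviations.
    slopes = [b * y_sd / s for b, s in zip(b_std, x_sds)]
    # Predicted raw criterion value with every predictor at its pooled mean.
    intercept_at_grand_mean = y_mean + y_sd * a_std
    # Predicted raw criterion value with every predictor at zero.
    intercept_at_zero = intercept_at_grand_mean - sum(
        b * xm for b, xm in zip(slopes, x_means))
    return intercept_at_zero, intercept_at_grand_mean, slopes

# The raw line y = 2 + 3x with pooled x mean 10, x s.d. 2, y mean 32, y s.d. 6
# corresponds to a standardized intercept of 0 and a standardized slope of 1.
print(to_raw_scale(0.0, [1.0], 32.0, 6.0, [10.0], [2.0]))
```

The intercept at the grand mean stays of the same order as the criterion itself, while the intercept at zero depends on how far the predictor means lie from the origin.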

In earlier publications by Novick et al. it was advocated to scale predictors at so-called "ideal scaling points" for which the least squares estimates for the intercept and that predictor were uncorrelated across groups. Calculation of these ideal scaling points was one of the tasks of the preparatory program "BPREP" by Thayer. Our reasons for preferring the grand means, also mentioned by Novick et al. as an alternative to ideal scaling points, are the following: (a) they are easier to obtain; (b) the intercept at the grand mean is more meaningful to the user than the intercept at some "ideal point" that he never met before; (c) uncorrelatedness of the LS estimates is not the same as the (intended) uncorrelatedness of the true parameter values; and (d) empirical evidence both from Novick and from us strongly suggests that the choice has a negligible influence on the final results.


5.  The  new  model 

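The three stages of the new model have been described in Table 1, and the modified assumptions underlying it were discussed in section 4. The joint posterior density of all parameters given the data is for the new model

$$
\begin{aligned}
p(\{\beta_{gi}\}, \{\beta_f\}, \{\mu_g\}, \{\psi_g\}, \phi \mid \{x_{hij}\}, \{y_{ij}\}) \propto{}
& \phi^{-(n+2)/2} \exp\Big[-\frac{1}{2\phi} \sum_i \sum_j \Big(y_{ij} - \sum_f \beta_f x_{fij} - \sum_g \beta_{gi} x_{gij}\Big)^2\Big] \\
& \times \prod_g \psi_g^{-(m+\nu'+2)/2} \exp\Big[-\frac{1}{2\psi_g}\Big\{\nu'\tau_g + \sum_i (\beta_{gi} - \mu_g)^2\Big\}\Big];
\end{aligned}
\qquad (2)
$$

here $n = \sum_i n_i$ denotes total sample size, and it is understood that in all summations $i$ ranges from 1 to $m$, and $j$ across the $n_i$ individuals of the i-th group; moreover $f \in F$ and $g \in G$, the index sets of the constant and free parameters respectively, and all values $x_{0ij}$ are identically 1 as dummies for the intercept.

Noting that

$$\sum_i (\beta_{gi} - \mu_g)^2 = \sum_i (\beta_{gi} - \beta_{g.})^2 + m(\beta_{g.} - \mu_g)^2, \qquad (3)$$

one integrates (2) with respect to each $\mu_g$, and the last line of (2) becomes

$$\prod_g \psi_g^{-(m+\nu'+1)/2} \exp\Big[-\frac{1}{2\psi_g}\Big\{\nu'\tau_g + \sum_i (\beta_{gi} - \beta_{g.})^2\Big\}\Big]. \qquad (4)$$

Next, integration with respect to $\psi_g$ turns this product into

$$\prod_g \Big\{\nu'\tau_g + \sum_i (\beta_{gi} - \beta_{g.})^2\Big\}^{-(m+\nu'-1)/2}. \qquad (5)$$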
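The logarithm of the posterior density is thus, up to an additive constant, and omitting the dependence on the data in the left hand side:

$$
\begin{aligned}
\log p(\{\beta_{gi}\}, \{\beta_f\}, \phi) ={}& -\tfrac{1}{2}(n + 2) \log \phi \\
& - \frac{1}{2\phi} \sum_i \sum_j \Big(y_{ij} - \sum_f \beta_f x_{fij} - \sum_g \beta_{gi} x_{gij}\Big)^2 \\
& - \tfrac{1}{2}(m + \nu' - 1) \sum_g \log \Big\{\nu'\tau_g + \sum_i (\beta_{gi} - \beta_{g.})^2\Big\}.
\end{aligned}
\qquad (6)
$$

It is instructive to compare (6) to (1). The first term is simplified because of the common residual variance $\phi$; moreover there is no final term involving geometric and harmonic means of the residual variances per group. Denoting the middle term as $-\tfrac{1}{2} Q(\beta)/\phi$, it is clear that the modal estimate for $\phi$ is $\hat{\phi} = Q(\beta)/(n + 2)$, and

$$
\begin{aligned}
\log p(\{\beta_{gi}\}, \{\beta_f\}, \hat{\phi}) ={}& -\tfrac{1}{2}(n + 2) \log Q(\beta) + \tfrac{1}{2}(n + 2) \log (n + 2) - \tfrac{1}{2}(n + 2) \\
& - \tfrac{1}{2}(m + \nu' - 1) \sum_g \log \Big\{\nu'\tau_g + \sum_i (\beta_{gi} - \beta_{g.})^2\Big\}.
\end{aligned}
\qquad (7)
$$

This makes clear the compromise character of the modal estimates for $\beta$. The first term of (7) would be maximized by minimizing $Q(\beta)$, that is by using the least squares estimates. The last term is maximized when $\beta_{gi} = \beta_{g.}$ for each $i$, but when the variance is less than the bound TAUMIN, the index passes into the set F, and we would end up using the pooled estimates. The point is further elaborated below.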

Differentiation of (6) with respect to one fixed $\beta_s$ ($s \in F$) yields

$$\sum_i \sum_j y_{ij} x_{sij} - \sum_g \sum_i \beta_{gi} \sum_j x_{gij} x_{sij} = \sum_f \beta_f \sum_i \sum_j x_{fij} x_{sij}. \qquad (8)$$

If the index set F contains $n_F$ elements and $\{\beta_{gi}\}$ are treated as known, (8) consists of $n_F$ linear equations ($s \in F$) in $n_F$ unknowns $\{\beta_f \mid f \in F\}$.

Differentiation of (6) with respect to one free parameter $\beta_{tu}$ ($t \in G$, $u \in \{1, 2, \ldots, m\}$) yields

$$\frac{1}{\phi}\Big\{\sum_j y_{uj} x_{tuj} - \sum_f \beta_f \sum_j x_{fuj} x_{tuj} - \sum_g \beta_{gu} \sum_j x_{guj} x_{tuj}\Big\} - (m + \nu' - 1)\, \frac{\beta_{tu} - \beta_{t.}}{\nu'\tau_t + \sum_i (\beta_{ti} - \beta_{t.})^2} = 0. \qquad (9)$$

Treating $\phi$, $\{\beta_f \mid f \in F\}$, $\beta_{t.}$ and the expression in the denominator as known, this is a set of $\ell + 1 - n_F$ linear equations, indexed by $t$, in $\ell + 1 - n_F$ unknowns $\beta_{gu}$ (for $g \in G$, $u$ fixed).

The solution for $\phi$ given all $\{\beta_f\}$ and $\{\beta_{gi}\}$ has already been mentioned just before (7). As announced in section 4b, each cycle of the iteration now consists of such a successive solution of all $\{\beta_f\}$ from (8), all $\{\beta_{gu}\}$ from (9) and of $\phi$ from $\hat{\phi} = Q(\beta)/(n + 2)$. It is followed by a check, for each index $g \in G$, whether $\sum_i (\beta_{gi} - \beta_{g.})^2/(m - 1) <$ TAUMIN; if this is so the index passes from G to F. This check is not made after the first cycle, because the values obtained there could still be too far from the true values to justify the fixing of the parameters. Before the iterations begin, however, it is checked whether some of the prior variances (model II or user-specified) are below TAUMIN, and if so the corresponding parameters are taken constant across groups.
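One cycle of this scheme can be sketched in a few lines. The code below is an illustration written for this description, not the FORTRAN or BASIC source: all names (`solve`, `q_value`, `cycle`) are hypothetical, the data are assumed to be given per group as rows `[1, x_1, ..., x_l]` with a leading 1 for the intercept, and the leaps (extrapolations) are left out.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            k = M[r][c] / M[c][c]
            for q in range(c, n + 1):
                M[r][q] -= k * M[c][q]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][q] * x[q] for q in range(r + 1, n))) / M[r][r]
    return x

def q_value(X, y, beta):
    """Q(beta): residual sum of squares over all groups."""
    return sum((yv - sum(b * xv for b, xv in zip(beta[i], row))) ** 2
               for i in range(len(X)) for row, yv in zip(X[i], y[i]))

def cycle(X, y, beta, phi, F, G, tau, nu, taumin, check=True):
    """One cycle: common betas from (8), free betas from (9), phi = Q/(n+2),
    then the TAUMIN check. beta[i][h] is updated in place."""
    m = len(X)
    n = sum(len(yi) for yi in y)
    F, G = sorted(F), sorted(G)
    if F:  # equation (8): pooled normal equations on y minus the free part
        A = [[sum(r[s] * r[f] for Xi in X for r in Xi) for f in F] for s in F]
        b = [sum(r[s] * (yv - sum(beta[i][g] * r[g] for g in G))
                 for i in range(m) for r, yv in zip(X[i], y[i])) for s in F]
        for f, val in zip(F, solve(A, b)):
            for i in range(m):
                beta[i][f] = val
    if G:  # equation (9): one system per group; means and denominators held fixed
        mean = {g: sum(beta[i][g] for i in range(m)) / m for g in G}
        c = {g: (m + nu - 1) / (nu * tau[g]
                 + sum((beta[i][g] - mean[g]) ** 2 for i in range(m))) for g in G}
        for u in range(m):
            A = [[sum(r[t] * r[g] for r in X[u]) / phi + (c[t] if t == g else 0.0)
                  for g in G] for t in G]
            b = [sum(r[t] * (yv - sum(beta[u][f] * r[f] for f in F))
                     for r, yv in zip(X[u], y[u])) / phi + c[t] * mean[t] for t in G]
            for g, val in zip(G, solve(A, b)):
                beta[u][g] = val
    phi = q_value(X, y, beta) / (n + 2)
    if check:  # skipped by the caller after the first cycle
        for g in list(G):
            mean_g = sum(beta[i][g] for i in range(m)) / m
            if sum((beta[i][g] - mean_g) ** 2 for i in range(m)) / (m - 1) < taumin:
                G.remove(g)
                F.append(g)
                for i in range(m):
                    beta[i][g] = mean_g
    return beta, phi, set(F), set(G)
```

A caller would invoke `cycle` repeatedly, passing `check=False` for the first cycle, and stop once the log posterior density (7) has stabilized. The sketch does not guard against a residual variance of exactly zero, which would make the division by `phi` fail in the next cycle.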


This section is mainly written for the benefit of the user of the interactive m-group regression program which was a result of the research project described in this report. One may wonder why so many improved models and computer programs were produced since the publication of the basic research between 1969 and 1972. Let us try to give an indication why Bayesian simultaneous regression estimation in m groups is a complicated matter, even compared to similar m-group models for means or proportions.

The Bayesian estimates can always be viewed as a compromise between least squares values and pooled values. Unless one of these extremes is compatible with both the data and the prior information, however, the simultaneous presence of an intercept and $\ell$ predictors poses an extra problem. Kelley could write $\hat{T}_i = \rho X_i + (1 - \rho) X_.$, and the reliability $\rho$ determines the extent to which regression to the mean occurs. In our regression model, however, this extent will typically differ from parameter to parameter. Not only do we have $\ell + 1$ different extents of regression, but also each extent, and the best value to regress to, are influenced by the decisions on the other extents (cf. Jackson, 1972, p. 224). And finally, when the extent was a reliability it could be estimated by one of the standard psychometric methods, but slopes and intercepts are not observable quantities, and this is an extra obstacle in trying to split their variance into true variance and error variance.
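For comparison, the single-parameter case of Kelley's formula is easy to state in code. This is a small illustration with hypothetical names and made-up numbers; it shows only the one-dimensional weighting that the regression model generalizes.

```python
def kelley_estimate(observed, mean, reliability):
    """Weighted average of an observed score and the mean (regression to the mean)."""
    return reliability * observed + (1.0 - reliability) * mean

# An observed score of 120, a mean of 100 and a reliability of 0.8:
print(kelley_estimate(120.0, 100.0, 0.8))
```

A reliability of 1 returns the observed score unchanged and a reliability of 0 returns the mean; in the m-group regression model the corresponding weight differs per parameter and is not directly observable.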
The program now used for Bayesian m-group regression has some predecessors. In 1972, the FORTRAN computer programs BPREP and BAYREG were written by Thayer and others and described in Jones and Novick (1972). A modified program MBREG, replacing BAYREG, is described in Molenaar (1978); the preparatory program MPREP proposed in that reference was never written. In the fall of 1978 MBREG was succeeded by BR, again by Molenaar, which incorporates nearly all the changes mentioned in the present report. The major exception is that BR has no rescaling of predictors and criterion. BR asks for some preprocessing of data, which could be done in the BASIC program described below, or in BPREP; independent use after different preprocessing is feasible. An input description of BR is added as Appendix A.

Lewis Chen turned the batch programs BPREP and BR into conversational programs in BASIC, called BR1 and BR2 respectively, and added several new features. David Chuang made some final additions, giving extra flexibility to the programs. This section gives a description of the program at that stage, reached in March 1979.

The program starts with an option of explanatory text, describing that it leads to joint modal estimates for regression coefficients in m similar (exchangeable) groups in cases of minimal prior knowledge. It specifies the restrictions (currently: at most 50 groups, at most 4 predictors, at least 6 observations per group) and announces the types of sufficient statistics per group that can be used for data entry.

Data entry may be completely via the keyboard, in a well documented but lengthy sequence of questions and answers. The standard option, however, is data entry from a previously prepared file. Such a file goes in the current version under the local name "SOMOATA." It may now have been prepared in an earlier keyboard entry session, but after updating of the CADA Data Management capabilities it will be possible to create the complete input file there. At the end of data entry, either file or keyboard, facilities for input revision are offered.

The program next displays the least squares (LS) estimates per group for the intercept at zero, intercept at the pooled mean of the predictors, slopes and residual standard deviation. It is advisable to study these in some detail: it could be wise to delete a group or split the analysis into clusters of groups if the LS values indicate a strong violation of exchangeability or of homoscedasticity between groups. It should be kept in mind, however, that for small sample sizes the LS values behave rather wildly, and that the estimated residual standard deviations may differ by a factor of, say, 3 without making the model of equal s.d. seriously misleading.

As an extra line below the LS values of the last group, the pooled values (PLD) are displayed, which would be obtained by pooling the observations from all groups and calculating one least squares regression equation for the joint data. The Bayesian estimates that the program seeks to obtain can always be viewed as a compromise between the extreme situations of LS (groups have nothing to do with each other) and PLD (groups are samples from the same population).


As the next step the program calculates model II ANOVA estimates for the variance across groups of the regression parameters, and the corresponding standard deviations. The calculation, described in Jackson (1972, p. 223-224), amounts to subtracting from the "observed" variance of the LS estimates the "error" variance that can be ascribed to sampling error. It is well known that such estimates can be negative, in which case the program replaces them by zero.
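A subtraction of this kind can be sketched as below. The exact formula of Jackson (1972) is not reproduced here: the sketch assumes, purely for illustration, that the error variance is approximated by the average squared standard error of the LS estimates, and the function name is hypothetical.

```python
from statistics import variance

def model_ii_variance(ls_estimates, squared_standard_errors):
    """Observed variance of the LS estimates minus an error variance, floored at zero."""
    observed = variance(ls_estimates)  # sample variance across groups
    # Assumed approximation of the sampling error variance (not Jackson's formula).
    error = sum(squared_standard_errors) / len(squared_standard_errors)
    return max(0.0, observed - error)

print(model_ii_variance([1.0, 2.0, 3.0], [0.25, 0.25, 0.25]))
print(model_ii_variance([1.0, 1.1, 0.9], [0.5, 0.5, 0.5]))
```

The second call illustrates the situation described in the text: the sampling error exceeds the observed spread, the raw difference is negative, and zero is returned instead.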

For the intercept this part of the program assumes all predictors at the grand mean, which is shown on the same display. It is obvious that the intercept with all the predictors at zero could exhibit much more variability. Criterion values for the predictors at the grand mean should be more meaningful for the user, and their variability is to a large extent independent of variability in the slopes.

At this stage the user has an important option: he may delete some predictor (which may avoid multicollinearity problems) or some group (which may avoid violations of exchangeability and/or homoscedasticity).

When a satisfactory set of predictors and groups has been selected, the program proceeds to specification of prior information. This requires first prior estimates of the standard deviations across groups of the regression parameters. The user may choose either the model II estimates or provide his own prior information. In the absence of such information the model II values are certainly useful, although they have the properties of (a) making the prior data dependent and (b) ascribing all variance to sampling error whenever the estimate comes out negative, thus forcing the corresponding slope or intercept to be constant across groups. Our personal feeling is that there are situations in which the user has no idea about true between-group variability (then use model II) and also situations in which previous experience with similar regression problems enables the user to guess, at least accurately up to a factor of three, say, the prior standard deviation between groups.

When the user is doubtful as to whether these prior standard deviations are not just pure guesswork, we have two consolations for him. First, the model does not use this prior value as such, but it assumes that the true prior variance has an inverse chi square distribution with low degrees of freedom around the square of the supplied value as a typical one, so all kinds of smaller and larger variances remain possible. These degrees of freedom are the next question asked by the program: the recommended range is 1 through 10, with many groups a little higher than with few groups. For most cases df = 5 will be a reasonable choice. Secondly, the user may rerun his analysis with different prior s.d. or df and find out for himself whether his results are very sensitive to his subjective decisions (our experience is that they typically are not essentially influenced unless rather small amounts of data are used). Note that the final values of log posterior density are not comparable between runs with different prior s.d. or df.

A last choice that the user may make is whether he wants the iterations to start from LS or PLD initial estimates. It is advised to use PLD, and LS only in cases where large datasets make it plausible that the end results will be close to LS. This option is useful when the existence of bimodality is feared: if convergence from both extreme initial situations leads to the same log posterior density (up to 4 significant digits) and the same slopes and intercepts at the grand mean (up to 2 significant digits) the risks of obtaining a local maximum are highly reduced. If the user reruns the program, after obtaining Bayesian estimates, with different prior s.d. or df, it is also possible to use the earlier Bayesian estimates as initial values. This option usually leads to faster convergence than PLD or LS initial estimates.

Now, at last, the program has enough information to start the iterative process. In each cycle several systems of linear equations have to be solved and the corresponding sets of parameters are updated. As this may be time-consuming on a medium-sized or small computer, the value of the log posterior density at the end of each cycle is printed so that the user may follow the search for its maximum. After the fifth cycle and then after each fourth next cycle there may be more increase of the log posterior density because an extrapolation or leap is made. The iteration stops when the log posterior density is stable in five significant digits. If this takes more than 10 cycles, the user may exit the iteration process after each set of 10 cycles. This facility could be useful when a restart with other initial estimates or prior values is desired. The use of the estimates obtained before stabilization of the log posterior density should not be encouraged: it is a very flat surface as a function of its many parameters, and small changes in the log posterior density may correspond to substantial changes in the slopes and intercepts.

The next display shows the modal posterior values of intercept at zero, slopes, and intercept at grand mean. This is done for all groups, or for 10 groups at a time if there are more than 10. At the bottom the modal estimate of the residual variance and the corresponding standard deviation are given (homoscedasticity is assumed both within and between groups).

It is obvious that the user will want to keep the final modal estimates. In many cases he will also be interested in keeping the prior s.d.'s and df and the LS and PLD estimates. The program therefore opens a local file DATB, in which these quantities are entered in fixed format for later use. See Appendix B for a full description. This file should be printed or copied before the next run of the program, because that run would overwrite it.

7.  Conclusion,  possible  extensions 

The new feature of this program allowing constant parameters across groups upon suggestion of either the user or the data seems to be a satisfactory solution to the problems of almost-degeneracy encountered before. Together with the extrapolation of iterations by leaps, it permits a fast and stable iterative estimation of the many parameters involved in simultaneous multiple regression. The results remain somewhat sensitive, however, to different prior specifications. Research on prior elicitation now going on in both Pittsburgh and Iowa City may assist future users on this point.

The revised program and model are now ready for application, but the authors cannot resist the temptation to mention a few possible improvements.

Prior knowledge on means. The assumptions of uniform distributions for the parameters $\beta_f$ common to all groups and for the means $\mu_g$ of parameters different per group could be relaxed to allow the use of prior information.


Angles for slopes. Normal distributions for slopes are not a very realistic model unless the coefficient of variation is small. Slopes for the groups that are normally distributed with e.g. a mean of 3 and a standard deviation of .2 are acceptable, but not slopes with a mean of 3 and a standard deviation of 2: the slope change from 1 to 3 is certainly more drastic than from 3 to 5, and even more when we compare a change from -1 to 3 to a change from 3 to 7. Neither uniform priors for mean slopes nor a prior for the variance of a slope independent of the mean seem to reflect our belief about slopes. Parameterization in terms of angles rather than slopes does away with most of these problems and will be examined in future research. It is not a serious drawback that it leads to non-conjugate distributions. As Bayesian modal estimates thus far have typically shown small standard deviations, the practical impact of using angles for slopes will not be dramatic.
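The asymmetry between equal slope differences becomes visible once slopes are converted to angles. The sketch below assumes the angle is simply the arctangent of the slope on the scales in use; the function names are hypothetical.

```python
import math

def slope_to_angle(slope):
    """Angle of the regression line with the horizontal axis, in degrees."""
    return math.degrees(math.atan(slope))

def angle_change(slope_a, slope_b):
    """Absolute change in angle when the slope moves from slope_a to slope_b."""
    return abs(slope_to_angle(slope_b) - slope_to_angle(slope_a))

print(angle_change(1, 3), angle_change(3, 5))     # same slope difference of 2
print(angle_change(-1, 3), angle_change(3, 7))    # same slope difference of 4
```

In both pairs the first change covers a far larger angle than the second, although the slope differences are equal.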

LS estimates in restricted model. The model II estimates are obtained by subtracting sampling variance from the "observed variance" of the LS estimates. Once some of them are negative and the corresponding parameters are fixed, one could recalculate LS estimates under that restriction: common values for some parameters, free values for the others. Such a set of restricted LS estimates would be useful for two purposes: they would be a better set of initial estimates for the iteration, and the model II variance estimate for the still free parameters among them is a better value for the prior variance, because the restriction of some parameters certainly affects the mean, the raw variance, and the sampling error of the others.


The data deck for the FORTRAN program BR consists of the following cards:

1. Identification Card (10A8)

Col. 1-80   Identification for data

2. Parameter Card (3I4, E8.2, F5.0, 8I2, 4F5.1)

M and NV must be read in; other parameters get default values if blank.

col.    name    format  description
1-4     M       I4      number of groups (≤ 25)
5-8     NV      I4      number of predictors (≤ 4)
9-12    NCMX    I4      maximum number of cycles (default = 30 is used when 0;
                        numbers exceeding 100 are replaced by 100)
13-20   TAUMIN  E8.2    if prior variance, or calculated variance beyond cycle 2,
                        is less than TAUMIN, a common value across groups is used.
                        Default = 10^(…) is used if number read is less than 10^(…)
21-25   PHIMIN  F5.0    minimum for residual variance (default = 10^(-3) if number
                        read is less than 10^(-7)); not used in this version
26-27   INIST   I2      0* = LS initial values
                        1* = pooled initial values
                        2* = model II initial values
                        3 = read initial values, a at ideal point
                        4 = read initial values, a at scaling point
                        5 = read initial values, a at origin
28-29   IWR     I2      0* = no details on iteration
                        1 = details are printed
30-31   INTAU   I2      0* = model II prior variances
                        1 = read prior variances
32-33   IPUN    I2      0 = no punched output
                        1 = modal estimates are punched (8X, 6E12.6)
34-35   NDH     I2      iteration stops when log posterior density is constant
                        in NDH leading digits (default = 5)
36-37   NDB     I2      no leaps are taken for a mean constant in NDB leading
                        digits (default = 4)
38-39   NCI     I2      number of cycles preceding first leap (default = 5, but
                        4 is used if number read ≤ 4)
40-41   NCF     I2      number of cycles between leaps (default = 4 is used if
                        number read ≤ 4)
42-46   SCH     F5.1    leap = SCH × last difference if difference has just
                        changed sign or old difference almost 0 (default = 20.0)
47-51   DCN     F5.1    leap = DCN × last difference if this difference is not
                        substantially closer to 0 than previous difference
                        (default = 20.0)
52-56   VGT     F5.1    not used in this version
57-61   PNU     F5.1    degrees of freedom for prior variances (default = 1 is
                        used when number read is less than 10^(…))

* Not yet available.
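Reading such a card outside FORTRAN amounts to slicing fixed-width fields. The sketch below is an illustration and not part of BR: it assumes the field widths implied by the format (3I4, E8.2, F5.0, 8I2, 4F5.1), assumes that real values are punched with an explicit decimal point, and returns `None` for blank fields so that the caller can apply the defaults. The function and variable names are hypothetical.

```python
# (name, width, converter) for each field, in card order.
FIELDS = (
    [("M", 4, int), ("NV", 4, int), ("NCMX", 4, int),
     ("TAUMIN", 8, float), ("PHIMIN", 5, float)]
    + [(name, 2, int) for name in
       ("INIST", "IWR", "INTAU", "IPUN", "NDH", "NDB", "NCI", "NCF")]
    + [(name, 5, float) for name in ("SCH", "DCN", "VGT", "PNU")]
)

def parse_parameter_card(card):
    """Split an 80-column parameter card into a dict of field values."""
    card = card.ljust(80)
    values, pos = {}, 0
    for name, width, convert in FIELDS:
        text = card[pos:pos + width].strip()
        values[name] = convert(text) if text else None  # None marks a blank field
        pos += width
    return values

card = ("   3   2  30" + "1.00E-06" + "     " + " 3 1 0 0 5 4 5 4"
        + " 20.0 20.0" + "     " + "  5.0")
params = parse_parameter_card(card)
print(params["M"], params["NV"], params["TAUMIN"], params["PNU"])
```

The example card is made up for the illustration; the fields PHIMIN and VGT are left blank in it.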


3. Prior Variance Estimates Card (6E12.6)

Col. 1-12    $\tau_0$ = variance estimate for intercept (ideal scaling)
     13-24   $\tau_1$ = variance estimate for coefficient of first predictor
     ...     (similarly for other predictors)

The remaining cards are read from a local file "DATA", not from INPUT, as they will remain the same for various analyses of the same dataset.

4. Predictor Card (4A8)

Col. 1-8     Name of 1st predictor
     9-16    Name of 2nd predictor

5. Scaling Card for Original Scaling Points (5F8.0)

Col. 1-8     Value to which criterion has been scaled
     9-16    Value to which predictor 1 has been scaled
     17-24   Value to which predictor 2 has been scaled

6. Scaling Card for Ideal Points (5E13.6)

Col. 1-13    Value to which criterion has been scaled
     14-26   Ideal scaling point for predictor 1
     27-39   Ideal scaling point for predictor 2


7.  Format  Card  for  SCP  Matrix  (A8) 

The  cross  products  must  be  read  in  floating  point  form. 

8.  SCP Matrix Cards

For each group, there must be an upper triangular cross-product
matrix punched according to the format specified by card 7. The
cross-product matrices have the following form for the case of
two predictors:

     Row 1
     Row 2
     Row 3
     Row 4

[The matrix entries are only partly legible in the source; they are
sums over the observations i in group j, such as Σ x_i1j and
Σ x_i1j x_i2j.]

These cross products are scaled to the values given by card 5.

9.  Initial Values Cards (6E12.6)

For each group there must be a set of initial values, either
produced by BPREP or obtained separately. For the j-th group,

Col.  1-12  initial value for β_0j
     13-24  initial value for β_1j
     . . .
            Initial value for [symbol illegible] (must appear as
            the last entry on each card).
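
For item 8, the upper triangular cross-product matrix of one group could be prepared along the following lines. This is a minimal Python sketch, not part of BR or BPREP. The function name and the ordering of variables within each observation (a constant 1, then the predictors, then the criterion) are assumptions made for illustration; the ordering the program actually expects should be checked against the matrix layout in the report.

```python
def upper_triangular_scp(observations):
    """Return the upper triangle of the sums-of-cross-products matrix.

    observations: list of rows, one per case in the group, each row a
    list of variable values (assumed order: 1, x1, x2, ..., y).
    Row a of the result holds the sums for the pairs (a, a), (a, a+1), ...
    """
    n_vars = len(observations[0])
    scp = []
    for a in range(n_vars):
        scp.append([
            sum(row[a] * row[b] for row in observations)
            for b in range(a, n_vars)
        ])
    return scp


# Two cases, two predictors: the result has four rows of lengths 4, 3, 2, 1.
group = [
    [1, 2.0, 1.0, 3.0],
    [1, 4.0, 3.0, 5.0],
]
scp = upper_triangular_scp(group)
```

With two predictors the result has four rows, matching the four rows named in item 8.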

As mentioned in the text, one possible source of the information
required in items 3 (Prior Variance Estimates), 6 (Ideal Scaling
Points), and 9 (Initial Values) is the FORTRAN program BPREP.

The information required to run that program is given by Jones
and Novick (1972, p. 24).

The program BR makes use of the IMSL library routine LEQT1F
for linear equations. This routine, or a similar one, should thus
be available during execution, as should the local file "DATA"
containing items 4 through 9 listed above.


Description of the local file "DATB" on which the program writes
results important to the user:

There are at least three blocks of information. Each block consists
of at least one title line, followed by m lines of numbers. These
are group number, intercept at zero, intercept at pooled mean, and
slopes for each of the predictors. The FORTRAN format for each of
these lines is (I3, 3X, K(F10.4)), where K is the number of predictors
plus two. Blocks are separated by a blank line.
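
The line layout (I3, 3X, K(F10.4)) can be illustrated with a short Python sketch that writes and reads one such line. It is not part of BR; the function names are hypothetical, and it assumes each numeric line holds exactly K = number of predictors + 2 values after the group number, as described above.

```python
def format_datb_line(group, values):
    """Format one result line as I3, 3X, then one F10.4 field per value."""
    return f"{group:3d}" + " " * 3 + "".join(f"{v:10.4f}" for v in values)


def parse_datb_line(line, n_predictors):
    """Split one result line back into the group number and its K values."""
    k = n_predictors + 2          # intercept at zero, intercept at pooled mean, slopes
    group = int(line[0:3])        # I3
    start = 6                     # skip the 3X padding after the group number
    values = [float(line[start + 10 * i:start + 10 * (i + 1)]) for i in range(k)]
    return group, values


# Group 1 with two predictors: two intercepts followed by two slopes.
line = format_datb_line(1, [0.5, 1.25, -0.3333, 2.0])
group, values = parse_datb_line(line, n_predictors=2)
```

Title lines and the blank lines between blocks would have to be skipped before applying `parse_datb_line`.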

The  titles  for  the  blocks  are 

1.  PER GROUP LEAST SQUARES REGRESSION WEIGHTS
    GROUP  INT(0)  INT(PM)  (predictor names).

2.  POOLED LEAST SQUARES REGRESSION WEIGHTS.

3.  BAYESIAN MODAL REGRESSION WEIGHTS
    PRIOR         PRIOR SD
    DF            INT(PM)  (predictor names)
    (value of ν)  (values of τ)

4.  Same as 3, for each additional Bayesian analysis after the first.